Yan Teng

SentGuard: Sentence-Level Streaming Guardrails for Large Language Models

Jun 01, 2026

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

May 28, 2026

Frequency-Domain Regularized Adversarial Alignment for Transferable Attacks against Closed-Source MLLMs

May 20, 2026

Towards Context-Invariant Safety Alignment for Large Language Models

May 20, 2026

Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence

May 07, 2026

Mechanistic Origin of Moral Indifference in Language Models

Mar 16, 2026

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

Mar 02, 2026

From Sparse Decisions to Dense Reasoning: A Multi-attribute Trajectory Paradigm for Multimodal Moderation

Jan 28, 2026

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

Jan 04, 2026

UniMark: Artificial Intelligence Generated Content Identification Toolkit

Dec 13, 2025